
    A trust region-type normal map-based semismooth Newton method for nonsmooth nonconvex composite optimization

    We propose a novel trust region method for solving a class of nonsmooth and nonconvex composite-type optimization problems. The approach embeds inexact semismooth Newton steps for finding zeros of a normal map-based stationarity measure for the problem in a trust region framework. Based on a new merit function and acceptance mechanism, global convergence and transition to fast local q-superlinear convergence are established under standard conditions. In addition, we verify that the proposed trust region globalization is compatible with the Kurdyka-Łojasiewicz (KL) inequality, yielding finer convergence results. We further derive new normal map-based representations of the associated second-order optimality conditions that have direct connections to the local assumptions required for fast convergence. Finally, we study the behavior of our algorithm when the Hessian matrix of the smooth part of the objective function is approximated by BFGS updates. We successfully link the KL theory, properties of the BFGS approximations, and a Dennis-Moré-type condition to show superlinear convergence of the quasi-Newton version of our method. Numerical experiments on sparse logistic regression and image compression illustrate the efficiency of the proposed algorithm.
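    As a concrete illustration of the normal map mechanics (not the paper's algorithm, whose merit function and acceptance mechanism are more refined), the following minimal Python sketch applies semismooth Newton steps to the normal map of an ℓ1-regularized smooth problem inside a crude trust-region loop. The callables grad_f and hess_f for the smooth part are assumptions for this sketch:

    import numpy as np

    def prox_l1(z, lam):
        # Soft-thresholding: proximity operator of lam * ||.||_1.
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def normal_map(z, grad_f, lam):
        # F_nor(z) = grad_f(prox(z)) + (z - prox(z)) / lam; zeros of F_nor
        # correspond to stationary points x = prox(z) of the composite problem.
        x = prox_l1(z, lam)
        return grad_f(x) + (z - x) / lam

    def semismooth_newton_tr(z, grad_f, hess_f, lam, delta=1.0, tol=1e-8, max_iter=100):
        # Sketch only: the acceptance test below is a plain residual-decrease
        # check, not the paper's merit-function-based mechanism.
        for _ in range(max_iter):
            F = normal_map(z, grad_f, lam)
            if np.linalg.norm(F) <= tol:
                break
            x = prox_l1(z, lam)
            D = (np.abs(z) > lam).astype(float)              # element of the Clarke Jacobian of prox
            J = hess_f(x) * D[None, :] + (np.eye(len(z)) - np.diag(D)) / lam
            d = np.linalg.solve(J, -F)                       # (inexact) semismooth Newton step
            d *= min(1.0, delta / (np.linalg.norm(d) + 1e-16))   # trust-region clipping
            if np.linalg.norm(normal_map(z + d, grad_f, lam)) < np.linalg.norm(F):
                z, delta = z + d, 2.0 * delta                # accept step, expand radius
            else:
                delta *= 0.5                                 # reject step, shrink radius
        return prox_l1(z, lam)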

    Variational Properties of Decomposable Functions Part II: Strong Second-Order Theory

    Local superlinear convergence of the semismooth Newton method usually requires the uniform invertibility of the generalized Jacobian matrix, e.g., BD-regularity or CD-regularity. For several types of nonlinear programming and composite-type optimization problems (for which the generalized Jacobian of the stationary equation can be calculated explicitly), this is characterized by the strong second-order sufficient condition. However, general characterizations are still not well understood. In this paper, we propose a strong second-order sufficient condition (SSOSC) for composite problems whose nonsmooth part has a generalized conic-quadratic second subderivative. We then discuss the relationship between the SSOSC and another second-order-type condition that involves the generalized Jacobians of the normal map. In particular, these two conditions are equivalent under certain structural assumptions on the generalized Jacobian matrix of the proximity operator. Next, we verify these structural assumptions for $C^2$-strictly decomposable functions by analyzing their second-order variational properties under additional geometric assumptions on the support set of the decomposition pair. Finally, we show that the SSOSC is further equivalent to strong metric regularity of the subdifferential, the normal map, and the natural residual. Counterexamples illustrate the necessity of our assumptions.
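    To fix notation, here is a schematic of the objects discussed above for a composite problem $\min_x f(x) + g(x)$ with $f$ smooth; this is the standard form of the normal map and of a second-order condition of this type, while the paper's SSOSC, built on the generalized conic-quadratic second subderivative, refines the second display:

    \[
      F^{\mathrm{nor}}_{\lambda}(z) := \nabla f\bigl(\operatorname{prox}_{\lambda g}(z)\bigr) + \lambda^{-1}\bigl(z - \operatorname{prox}_{\lambda g}(z)\bigr), \qquad \lambda > 0,
    \]
    \[
      \langle \nabla^{2} f(\bar{x})\, h, h\rangle + \mathrm{d}^{2} g\bigl(\bar{x} \mid -\nabla f(\bar{x})\bigr)(h) > 0 \qquad \text{for all } h \neq 0,
    \]

    where $\mathrm{d}^{2} g$ denotes the second subderivative. Zeros $\bar{z}$ of $F^{\mathrm{nor}}_{\lambda}$ correspond to stationary points $\bar{x} = \operatorname{prox}_{\lambda g}(\bar{z})$, and strengthened conditions of this kind are what yield uniform invertibility of the generalized Jacobians of $F^{\mathrm{nor}}_{\lambda}$.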

    A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction

    We develop an implementable stochastic proximal point (SPP) method for a class of weakly convex, composite optimization problems. The proposed stochastic proximal point algorithm incorporates a variance reduction mechanism, and the resulting SPP updates are solved using an inexact semismooth Newton framework. We establish detailed convergence results that take the inexactness of the SPP steps into account and that are in accordance with existing convergence guarantees of (proximal) stochastic variance-reduced gradient methods. Numerical experiments show that the proposed algorithm competes favorably with other state-of-the-art methods and achieves higher robustness with respect to the step size selection.
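    A minimal sketch of a variance-reduced SPP loop of this flavor is given below; grads(i, x) and full_grad(x) are assumed callables returning the component gradient and the full gradient. The paper solves each proximal subproblem with an inexact semismooth Newton method, which this sketch replaces by a few plain gradient steps purely for illustration:

    import numpy as np

    def spp_vr(x0, grads, full_grad, n, alpha=0.1, epochs=10, inner_gd=5):
        # SVRG-style control variate wrapped around a stochastic
        # proximal point update; each subproblem is solved inexactly.
        x = x0.copy()
        for _ in range(epochs):
            x_snap = x.copy()
            g_snap = full_grad(x_snap)                 # full gradient at the snapshot
            for _ in range(n):
                i = np.random.randint(n)
                shift = grads(i, x_snap) - g_snap      # variance-reduction correction
                # Inexactly solve: min_y f_i(y) - <shift, y> + ||y - x||^2 / (2 * alpha)
                y = x.copy()
                for _ in range(inner_gd):
                    step = grads(i, y) - shift + (y - x) / alpha
                    y -= alpha * step
                x = y
        return x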

    Convergence of Random Reshuffling under the Kurdyka-Łojasiewicz Inequality

    We study the random reshuffling (RR) method for smooth nonconvex optimization problems with a finite-sum structure. Although this method is widely utilized in practice, e.g., in the training of neural networks, its convergence behavior is only understood in several limited settings. In this paper, under the well-known Kurdyka-Łojasiewicz (KL) inequality, we establish strong limit-point convergence results for RR with appropriate diminishing step sizes, namely, the whole sequence of iterates generated by RR is convergent and converges to a single stationary point in an almost sure sense. In addition, we derive the corresponding rate of convergence, depending on the KL exponent and the suitably selected diminishing step sizes. When the KL exponent lies in $[0,\frac{1}{2}]$, the convergence is at a rate of $\mathcal{O}(t^{-1})$ with $t$ counting the iteration number. When the KL exponent belongs to $(\frac{1}{2},1)$, our derived convergence rate is of the form $\mathcal{O}(t^{-q})$ with $q \in (0,1)$ depending on the KL exponent. The standard KL inequality-based convergence analysis framework only applies to algorithms with a certain descent property. We conduct a novel convergence analysis for the non-descent RR method with diminishing step sizes based on the KL inequality, which generalizes the standard KL framework. We summarize our main steps and core ideas in an informal analysis framework, which is of independent interest. As a direct application of this framework, we also establish similar strong limit-point convergence results for the reshuffled proximal point method.
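    For concreteness, a minimal sketch of RR with diminishing step sizes follows; grad_i(i, x) is an assumed callable returning the component gradient $\nabla f_i(x)$, and the precise step-size conditions are those stated in the paper:

    import numpy as np

    def random_reshuffling(x0, grad_i, n, epochs=100, alpha0=1.0, gamma=0.75):
        # RR with diminishing step sizes alpha_t = alpha0 / (t + 1)^gamma;
        # gamma is chosen in the typical range for RR-type analyses.
        x = x0.copy()
        for t in range(epochs):
            alpha = alpha0 / (t + 1) ** gamma
            perm = np.random.permutation(n)        # fresh shuffle each epoch
            for i in perm:                         # one full pass over the components
                x = x - alpha * grad_i(i, x)
        return x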

    Nonmonotone globalization for Anderson acceleration via adaptive regularization

    Anderson acceleration (AA) is a popular method for accelerating fixed-point iterations, but it may suffer from instability and stagnation. We propose a globalization method for AA to improve stability and achieve unified global and local convergence. Unlike existing AA globalization approaches that rely on safeguarding operations and might hinder fast local convergence, we adopt a nonmonotone trust-region framework and introduce an adaptive quadratic regularization together with a tailored acceptance mechanism. We prove global convergence and show that our algorithm attains the same local convergence as AA under appropriate assumptions. The effectiveness of our method is demonstrated in several numerical experiments.
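    A minimal sketch of the ingredients involved is given below, assuming a fixed-point map g: a type-II Anderson update with Tikhonov-regularized least squares and a simple nonmonotone residual-based acceptance test. The paper's adaptive regularization and trust-region acceptance rule are considerably more elaborate than this illustration:

    import numpy as np

    def anderson_accelerated(g, x0, m=5, mu=1e-8, max_iter=200, tol=1e-10, window=5):
        # Type-II Anderson acceleration with regularized mixing and a
        # nonmonotone acceptance test on the fixed-point residual norm.
        x = x0.copy()
        X, R = [], []                              # iterate and residual histories
        hist = []                                  # recent residual norms
        for _ in range(max_iter):
            r = g(x) - x                           # fixed-point residual
            if np.linalg.norm(r) <= tol:
                break
            X.append(x.copy())
            R.append(r.copy())
            X, R = X[-(m + 1):], R[-(m + 1):]      # keep at most m differences
            if len(R) > 1:
                dR = np.stack([R[k + 1] - R[k] for k in range(len(R) - 1)], axis=1)
                dX = np.stack([X[k + 1] - X[k] for k in range(len(X) - 1)], axis=1)
                # Tikhonov-regularized least squares for the mixing coefficients.
                coef = np.linalg.solve(dR.T @ dR + mu * np.eye(dR.shape[1]), dR.T @ r)
                x_aa = x + r - (dX + dR) @ coef    # type-II Anderson update
            else:
                x_aa = g(x)                        # plain fixed-point step
            r_aa = np.linalg.norm(g(x_aa) - x_aa)
            if not hist or r_aa <= max(hist[-window:]):   # nonmonotone acceptance
                x, mu = x_aa, max(mu * 0.5, 1e-12)        # accept, relax regularization
            else:
                x, mu = g(x), mu * 10.0                   # fall back, strengthen it
            hist.append(np.linalg.norm(g(x) - x))
        return x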